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ABSTRACT 

We present 2-D stellar kinematics of M87 out to R — 238" taken with the integral field spectrograph 
VIRUS-P. We run a large set of axisymmetric, orbit-based dynamical models and find clear evidence 
for a massive dark matter halo. While a logarithmic parameterization for the dark matter halo is 
preferred, we do not constrain the dark matter scale radius for an NFW profile and therefore cannot 
rule it out. Our best-fit logarithmic models return an enclosed dark matter fraction of 17.2^g Q% within 
one effective radius (R e = 100"), rising to AQAtH% within 2 R e . Existing SAURON data (R < 13"), 
and globular cluster kinematic data covering 145" < R < 554" complete the kinematic coverage to 
R = 47 kpc (~ 5 Re). At this radial distance the logarithmic dark halo comprises 

85.3^-5% of the 

total enclosed mass of 5.7±o"J x 10 12 M Q making M87 one of the most massive galaxies in the local 
universe. Our best- fit logarithmic dynamical models return a stellar mass-to-light ratio of 9-IIq' 2 , 
(V-band), a dark halo circular velocity of 8OO+25 km s _1 , and a dark halo scale radius of 36±g kpc. 

The stellar M/L, assuming an NFW dark halo, is well constrained to 8.201q'°q (V-band). The stars 
in M87 are found to be radially anisotropic out to R = 0.5 R e , then isotropic or slightly tangentially 
anisotropic to our last stellar data point at R = 2.4 R e where the anisotropy of the stars and globular 
clusters are in excellent agreement. The globular clusters then become radially anisotropic in the last 
two modeling bins at R = 3.4 R e and R = 4.8 R e . As one of the most massive galaxies in the local 
universe, constraints on both the mass distribution of M87 and anisotropy of its kinematic components 
strongly informs our theories of early-type galaxy formation and evolution in dense environments. 
Subject headings: galaxies: elliptical and lenticular, cD; galaxies: individual (M87, NGC4486); galax- 
ies: kinematics and dynamics 
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1. INTRODUCTION 

Dark matter is a central component of our cur- 
rent theory of large scale structure formation. Al- 
though the nature of dark matter is unknown, sig- 
nificant support for this cosmological pa radigm comes 
from well-motivated phys i cal ar g uments dGunn fc Gotti 
1971 IPress fc Schechterl fl97l IWhite fc ReesI 119781: 
Fillmore fc Goldrei ch 1984) and the remarkable agree- 
ment between N-body simulations of the growth of struc- 
ture (iFrenk et alJll985l iDavis et al.lll98a iNavarro et al.1 
[19951: ISpringel et al.l l2005 N ) and observations of the distri- 
bution of galaxies i n the local universe (|Davis et~aT1IT98l 
IColless et ani200l . 

With the increase in computational power seen over 
the past 30 years, the spatial resolution of numerical 
simulations has improved to the point where individual 
galaxies are well resol ved and their dark matter halos can 
be studied in detail (IMoore et al.lll99"8at iGhigna et al.l 
[20001: ISpringel et all 120081: iBovlan-Kolchin et al.l 120091) 7 
From the study of both cosmological and galaxy scale 
simulations, different parameterizations for a universal 
dark matter density profile have e merged. Einasto in- 
troduced an early parameterization (Einas tol [l96l [19681 
based on the Sersic profile for the light distribution in 
galaxies (Sersic 1968). Othe r dark matter profile param- 
eterizations have followed (iDubinski fc Carlbergl 119911 : 
INavarro et allll997l : IMoore et al.lll998bt ). While each pa- 
rameterization has found some level of success at describ- 
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ing the distribution of mass on the scales of galaxy clus- 
ters, understanding the extent and shape of galaxy-sized 
dark matter halos has met with mixed success. 

Observationally, the study of dark matter halos in 
spiral galaxies has outpaced that of ellipticals. This 
is largely due to the presence of extended HI discs 
found in spiral galaxies which provide a clean dy- 
namical tracer to several effective radii (iRubin et al.l 
119801: Ivan Albada fc Sancisil 119861: I Jimenez et al.l 12003ft . 
Analysis of the circular velocity curves of spiral 
galaxies provides some of the strongest evidence for 
the existence of dar k matter on galaxy scales (see 
iSofue fc Rubin! I2001L for a review). Lacking the 
extended HI discs seen in spiral galaxies, progress 
towards constraining the extent and distribution of 
dark matter in elliptical galaxies has proven a greater 
challenge. Despite t his complicat i on, evidence from 
gravi t ational lensing jKeetonl l200lt i Mandel baum et al.1 
120061: ISand et al.l 120081: iCarrasco et all 12010ft. X-r ay gas 
profiles (Humphrey et al.l 120061: iChurazov et al.l 120081: 
iDas et al.ll2010D , planeta ry nebulae (PNe) and globular 
clust e r (GC) kinemat i cs (|C6te et al1l2001t iDouglas et al.l 
I2007t iCoccato et~aT1 120091) an d integ r ated light stel- 
lar kinematics (I Bender et al.| Il994t lEmsellem et al.l 



2001 iCappellari et al.1 12000: iThomas et all 120071: 



Weiimans et al.M2009l: iForestelll [200JD has shown that 



elliptical galaxies are typically dark matter dominated 
beyond R ~ 1.5 R e . However, not all galaxies studied 
show d efinitive evidence for the existence of dark 
matter ([Gerhard et al.1 120011: iRomanowskv et al.l 12001 
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iMoni Bidin et al, 2010) and the best choice of dark halo 
parametrization remains elusive. These open questions 
leave key components of our theories of the growth of 
structure, galaxy formation and evolution largely in the 
dark. 

Comparison between the results of various mass es- 
timation methods return agr eement for cert a in sys tems 
and disagreement for others. iCoccato et al.l (|2009l ) find 
good agreement between integrated stellar light absorp- 
tion line kinematics and PNe data for a sample of 16 
early-type galaxies. Yet in other systems the agree- 
ment is poor. In an analysis of NGC1407, the cen- 
tral elliptical galax y in a nearby evolved galaxy group, 
iRomanowskv et all ((20091 ) find a discrepancy between 
the mass profile determined from GC kinematics and the 
profile determined b y X-ray gas. For the br ightest cluster 
galaxy in Abell 3827 lCarrasco et al.l (|2010l ) determine an 
enclosed mass via strong lensing that is 10 x higher than 
the mass determined from X-ray measurements. Mass 
discrepancies extend to tracers o ther than X-ray gas. 
Stella r kinematics of NGC82 1 fromlForestell fe Gebhardl 
(poTof) and NGC3379 from IWeiimans et"aIT7 2009) dis- 
agree with the PNe measurements of IRomanowskv et al.l 
(2003|). 

Each of these methods for estimating mass brings 
its own set of advantages, assumptions and limitations. 
Mass estimates based on X-ray gas have the advantage 
of very extended coverage, providing spatial overlap be- 
tween the other methods. Yet X-ray gas analysis is lim- 
ited to massive galaxies and commonly assumes hydro- 
static equilibrium of the gas. Strong lensing mass esti- 
mates avoid this potential pitfall as it makes no assump- 
tions regarding the energy distribution of the material 
within the lens. However, lensing is limited in its flexibil- 
ity, as the regions of the universe available for exploration 
are dictated by the fixed geometry of the lens and source. 
Velocity dispersion measurements from integrated stellar 
light are effectively available for all local systems, but re- 
quire a parameterization of the dark halo and involve as- 
sumptions about the degree of triaxiality of the system. 
There is also the challenge of getting stellar kinemat- 
ics at large radii where the dark halo comes to domi- 
nate the mass. PNe and GCs have an advantage here as 
they typically extend to large radii, yet whether these 
tracers follow the same dynamical history, and there- 
fore probe the same formation history as the stars, is 
not clear for all systems. A natural approach is to com- 
bine various data sets and methods in order to apply the 
strengths o f one method to ov e rcome the shortcomings 
of ano ther. iTreu fe Koopmansl (|2004j ) and iBolton et al.l 
( 2008) take this approach to good success by using both 
lensing and stellar kinematic s to break the well known 
mass-anisotrop y degeneracy (|Dejonghe fe Merrittl 119921 : 
IGerhardlfl99l . 

We focus here on the dark matter distribution in the 
giant elliptical galaxy M87, the second-rank galaxy in 
the Virgo Cluster. M87 has been extensively studied 
and a number of groups have made estimates of the ex- 
tent of M87's mass profile with a variety of methods. 
Empirical formulas, based on the virial theorem and 
measurements of the central stellar velocity dispersion, 
returned som e of the earliest mass e s timates for M87 
(lPovedalll96ll: iBrandt fe Rooserjiri969t iNieto fe Monnetl 
11984 ). iSargent et al.l ( 19781 ) used stellar velocity disper- 



sion measu rements extending t o R ~ 0.7i? e and the pho- 
tometry of lYoung et al.1 (|1978l ) to calculate the mass-to- 
light ratio (M/L) as a function of radius and estimate 
enclosed mass. Since that time, other mass estima t es of 
M8 7 using X-ray gas dFabricant fe Gorens tcin 19831 iTsal 
119961 : iMatsushita et al.l 120021: iDas et all 120101) and Gg 
kinematics dHuchra fe Brodid 119871; iMould et all 119871: 
iMerritt fe Tremblavlll993l) have been made. A compar- 
ison of these values to the mass estimate made in this 
work is given in §5.21 

The outline of the paper is as follows. In §2] we give 
the details of the data sets used in our dynamical model- 
ing, with specifics on the VIRUS-P instrument given in 
^2.41 An overview of the data reduction steps is given in 
§2( with the complete details provided in the Appendix. 
§3.11 explains the extraction of the line-of-sight velocity 
dispersion profile and §3.31 provides details of the selec- 
tion of template stars and their application. In §3 we 
explain the orbit-based dynamical models. In §3] we give 
the results of our dynamical modeling, with a discussion 
of our enclosed mass estimates and a comparison of the 
logarithmic and NFW halos found in §5.11 and £ 15. 21 We 
explore possible systematics in §5.41 

We assume a distance to M87 of 17.9 Mpc, correspond- 
ing to a scale of 86.5 pc arcsec -1 . 

2. DATA 

We make use of 3 sets of kinematic data to dynami- 
cally model M87. At large rad ii (140" < R < 5 40") we 
use globular cluster kinematics ((Cote et al.ll2001f). Stellar 
kinem atics from the SAURON data set (|Emsellem et al.l 
12001 are used within the central 13". New stellar kine- 
matics, taken with VIRUS-P (jHill et all l2008bl ), cover 
4" < R < 238" and add substantially to the two- 
dimensional spatial coverage of the galaxy. We provide 
details of the stellar surface brightness and globular clus- 
ter data in ij2.ll The SAURON stellar kinematics are 
discussed in H2.2I In M2.3I we describe the observations 
made with VIRUS-P. §JLl gives details of the VIRUS-P 
spectrograph and 8 32.51 explains the data collection. 

2.1. Photometry and Globular Cluster Kinematics 

The application of both the stellar surface 
brightness profile and glo bular cluster data follow 
IGebhardt fe Thomas! ((20091) (h ereafter GT09). The V - 
band photometry comes from iKormendv et al.( (1 2009). 
which is a combination of HST data from lLauer et alJ 
(1992) and various ground-based observations. This 
photometry extends from 0.02" to 2200". As the 
dynamical modeling requires the stellar surface density, 
the surface brightness profile is deprojected following the 
method of iMagorrianl (|1999f ). Our globular c luster sur- 
face density profile comes from iMcLaughlir] (|1999f ) and 
is deprojecte d via a nonparametric spherical inversion as 
described in IGebhardt et all (119961). The globular clus - 
ter vel o cities are re ported in iCohen fe Rvzhovl (|1997|) . 
iCohenl (120001) and Hanes et all (|200lD and compiled 
in iCote et all (|200lf T We employ the same cuts to 
remove fore ground and back ground contamination as 
described in lCote et al.1 ((20011 ). These cuts leave us with 
278 globular cluster velocities which we divide into 11 
modeling bins. A line-of-sight velocity dispersion profile 
(LOSVD) is then determined from all globular clusters 
within one modeling bin as described in GT09. 



The Dark Matter Halo of M87 



3 



2.2. SAURON Stellar Kinematics 

The SAURON data set provides two-dimensional spa- 
tial coverage of M87 out to nearly 40" with superior spa- 
tial resolution to VIRUS-P. We therefore use SAURON 
kinematics in the central region of M87. Once the size of 
the modeling bins makes the SAURON spatial resolution 
irrelevant (R > 8") the VIRUS-P data is used. We elect 
to use both SAURON and VIRUS-P kinematics between 
8" < R < 13" as described in gJJ 

The publicly available SAURON kinematics are 
parametrized by the first 4 coefficients of a Gauss- 
Hermite polynomial expansion. As our dynamical mod- 
cling fits the full LOSVD rather than its moments we re- 
construct the LOSVD via Monte Carlo simulations based 
on the errors provided by SAURON. The details of this 
reconstruction can be found in GT09. 

2.3. VIRUS-P Stellar Kinematics 

The VIRUS-P data were taken during three separate 
observing runs over 10 partial nights in January 2008, 
February 2008 and February 2009. VIRUS-P has no 
dedicated sky fibers. Therefore, sky nods are necessary 
and constitute approximately one-third of our observ- 
ing time. All our VIRUS-P data for M87 were acquired 
through a cadence of 20 minute science exposures brack- 
eted by 5 minute sky nods. We note that while not hav- 
ing dedicated sky fibers presents issues with determining 
the correct level of sky subtraction, sampling the sky with 
all 246 fibers allows us to better match the PSF varia- 
tion from fiber-to-fiber while not adding substantially to 
our photon noise. A discussion of both the advantages 
and drawbacks of sky nods, an d the details of our sky 
subtraction method are given in IA.2I 

The VIRUS-P data for M87 consists of 5 pointings ex- 
tending to 238.0" (20.6 kpc). The pointing placements 
are shown in Figure |4] Exposure times and radial dis- 
tances for each pointing are given in Table [T] Ten of the 
51 science exposures were taken under marginal condi- 
tions and withheld from the final data set as they de- 
graded our signal-to-noise (S/N). The exposure times 
quoted in Table Q] include only the data that went into 
the final spectra and subsequent modeling. 

2.4. The VIRUS-P Instrument 

The Visible Integral-field Replicable Unit 
Spectrograph- Prototype (VIRUS-P), currently de- 
ployed on the Harlan J . Smith 2.7 m telescope at 
McDonald Observatory (jHill et al.1 l2008bl) is a pro - 
totype for the VIRUS instrument (|Hill et all 120061 ). 
VIRUS is a replicated, fiber fed spectrograph currently 
under development for the Hobby Eberly Telescope 
Dark Energy eXperiment (HETDEX) (jHill et al.l 
l2008al) . Origin ally designed to conduct a Lyman-alph a 
emitter survey ([Adams et al.ll2010t [Blanc et al.l l2010af l. 
the VIRUS-P spectrograph is proving an excellent 
stand-alone ins trument for a wide range of scien- 
tific problems (lAdams et al.l 12001 IBlanc et al.l 120091: 
Yoachim et al.l 120091 : iBlanc et al.ll2010bl : lYoachim et al.l 
20101 ). VIRUS-P is a gimbal- mounted integral field unit 
spectrograph composed of 246 optical fibers each with a 
4.1" on-sky diameter. The CCD is a 2048 x 2048 back- 
illuminated Fairchild 3041 detector. The wavelength 
range for these observations is 3545-5845 A. The fibers 



are laid out in an hexa gonal array, similar to Densepak 
(jBarden fc Wa de 1988), with a one-third fill factor and 
a large (107" x 107") field of view. The large fibers 
and field of view make VIRUS-P an extremely efficient 
spectrograph for observing extended, low surface bright- 
ness objects such as the faint outer halos of elliptical 
galaxies. Gimbal-mounted directly to the barrel of the 
telescope, VIRUS-P maintains a constant gravity vector. 
Extensive analysis of the fiber-to-fiber wavelength 
solution and fiber spatial PSF has been conducted and 
shows negligible evolution over a night. To quantify 
the evolution, the location of the centers of the fibers 
from the twilight flats taken at the start and end of the 
night are compared and found to deviate < 0.1 pixels 
for all nights. The wavelength solution determined 
from the arc lamps taken at dusk and dawn are also 
compared. Typical residuals of the wavelength solution 
to known arc lines show an rms scatter of ~ 0.05 A 
for frames taken at the same time of night. This value 
of rms scatter does not increase when arcs from both 
dusk and dawn are combined. The one exception occurs 
with large temperature swings (> 10° C). Thermal 
contraction or expansion of the input and output ends 
of the fiber bundle can lead to a change in position and 
stress pattern on individual fibers. Localized pressure on 
a fiber can lead to focal r atio degradation (jCraig et al.l 
119881 : iSchmoll et all I2003T ) resulting in changes to the 
fiber position and spatial PSF over a night and increased 
RMS scatter in the wavelength solution residuals. These 
effects are subtle, yet can degrade the quality of our 
flat-fielding. Therefore, if a temperature change > 10° C 
is seen over a night, the data is split into two groups 
and reduced using the calibration frames taken at 
the closest temperature. We found this approach was 
necessary for two nights in our January 2009 observing 
run. However, even when a steep temperature gradient 
is seen, wavelength and flat-field calibration frames 
are necessary only at the start and end of a night's 
observing. 

The median spectral resolution for this VIRUS-P data 
is 4.75 A FWHM as determined from Gaussian fits to 
strong emission lines in the arc lamp frames. This reso- 
lution corresponds to an instrumental dispersion (sigma) 
of - 150 km s" 1 at 4060 A and - 112 km s" 1 at 5400 A. 
VIRUS-P was refocused between our January/February 
2008 and February 2009 observing runs which led to a 
non-trivial change (AFWHM ~ 0.5A) in the instrumen- 
tal resolution. As we frequently combine the spectra from 
different fibers and different nights, the change in instru- 
mental resolution is taken into account when extracting 
the stellar LOSVDs. The details of how differences in in- 
strumental resolution are handled can be found in 

The assumption of a Gaussian spectral PSF for 
VIRUS-P proves to be a good one. To quantify this, 
we fit Gauss-Hermite coefficients to 4 bright lines in our 
mercury-cadmium arc lamp frames for all 246 fibers. 
Over the 4 spectral lines and all fibers the median H3 
coefficient is 0.003 ± 0.013. The median H4 coefficient 
is 0.0003 ± 0.0117. Any non-Gaussian line behavior is 
further mitigated by the high dispersion of M87, which 
puts us well above the instrumental resolution. 

2.5. Data Collection 
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Calibration frames, taken at the start and end of each 
observing night, consist of a set of twilight frames, mer- 
cury and cadmium arc lamp frames and bias frames. The 
twilight frames are used for both flat-fielding and deter- 
mining the position and shape of each fiber profile. The 
arc lamp frames are used for the wavelength solution 
and determination of the instrumental resolution, (see 
IA.1I for more details) . The remainder of an observing 
night involves a sequence of 5 minute sky nods and 20 
minute science frames. The sky nods were taken 30' off 
the galaxy center in a region of sky with minimal field 
stars and continuum sources and where the galaxy has a 
surface brightness of /Xb ~ 26.5 (|Kormendv et alj |2009). 
While this position still includes intracluster light known 
to extend across m uch of the core of the Virgo Cluster 
(jMihos et al.ll2005l ). the contribution to the total flux is 
very low. 

3. DATA REDUCTION OVERVIEW 

We provide a brief overview of the data reduction pro- 
cess here, up through extraction of the kinematics. The 
extensive details can be found in the Appendix. 

The primary data reduction steps are completed with 
Vaccine, an in-house data reduction pipeline developed 
for VIRUS-P data. The reduction steps are as follows. 
All of the science, sky and calibration frames are over- 
scan subtracted. A master bias is created by combining 
all the overscan-subtracted bias frames taken during an 
observing run. The arcs and twilight flats are then com - 
bined using the biweight estimator (Be ers et al.lll990h . 
A 4 th order polynomial is fit to the peaks of each of 
the 246 fibers for each night. We refer to this as the 
fiber trace. This polynomial fit is then used on each 
science and sky frame to extract the spectra, fiber by 
fiber, within a 5 pixel wide aperture centered around 
the trace of the fiber. The wavelength solution is de- 
termined for each fiber, and for each night, based on a 
4 th order polynomial fit to the centers of known mer- 
cury and cadmium arc lamp lines. The twilight flats are 
normalized to remove the solar spectra. These normal- 
ized flats are then used to flatten the science and sky 
data. Once the frames are flattened, the neighboring 
sky frames are appropriately scaled, combined, and sub- 
tracted from the science frames. Cosmic rays are located 
and masked from each 20 minute science frame. For the 
dynamical modeling, the galaxy is divided into a series of 
line-of-sight radial and angular spatial bins. Therefore, 
fibers that fall within a spatial bin are combined. This 
step leaves us with individual spectra for 88 different spa- 
tial bins. Of these 88 spectra, the 8 central spectra are 
withheld from the dynamical modeling, as the SAURON 
data have superior spatial resolution in the central re- 
gion. The next step before the data is ready to model is 
the determination of the line-of-sight velocity dispersion 
profile, described below. 

3.1. Extraction of the LOSVD 

Our method for determination of the line-of- 
sight velocity disper sion profile (LOSVD) f ollows 
iGebhardt etail ((20001 ) and iPinknev et ail ([200l . We 
give an overview of the method here. 

To begin, an initial guess for a non-parametric LOSVD 
of the stars is made. This LOSVD is distributed into 29 



velocity bins and then convolved with a set of 12 template 
stars taken from the Indo-US template library. Selection 
of the template stars is discussed in £ )3.3I The continuum 
is divided out of both galaxy and template spectra prior 
to fitting. The fitting routine works by allowing both 
the weights given to each of the 29 velocity bins and the 
weights given to each template star to vary. A parameter 
to allow for an adjustment to the overall continuum of 
the template stars is also allowed to float. Minimization 
of the residuals of the fit of the convolved stellar tem- 
plate spectra to the galaxy spectra is used to determine 
the best LOSVD for that given spatial bin and spectral 
region. 

One of the great advantages VIRUS-P provides in the 
extraction of the LOSVD and subsequent error estimates 
is its wide wavelength range (~ 2200 A). The wide wave- 
length coverage allows us to determine the best LOSVD 
in five different wavelength regions. Of the 5 spectral 
regions sampled (Table [2]), 4 of the spectral regions are 
used in the final modeling. The Ca H + K spectral re- 
gion (3650-4150 A) proves difficult to fit and exhibits a 
large systematic offset in all of the first 4 moments of 
the LOSVD from the other 4 spectral regions, likely due 
to issues with the continuum division. This region is 
therefore not included in the determination of the final 
LOSVD and error estimate. 1 The final LOSVD is cre- 
ated by taking the average of the 4 LOSVDs within each 
of the 29 velocity bins. Figure [6] shows two of the final 88 
LOSVDs, with errors, for a bin at R = 24" and R = 174". 
Overplotted in these figures are the LOSVDs from the 4 
spectral regions used to generate the final LOSVDs. 

A smaller systematic offset was observed for the Mg b 
spectral region (see Figures [8] and [9]) . Yet unlike the Ca 
H + K offset, which stems from the difficulty in deter- 
mining the placement of blue continuum, we believe this 
offset is inherent to the Mg b spectral region and there- 
fore elect to include it in our final LOSVDs and subse- 
quent modeling. This decision was made as a trade-off 
between the ~ 10% offset in velocity dispersion seen with 
this spectral region, and the mitigating effects a 4 th spec- 
tral region has on the statistics of the final LOSVD and 
uncertainty estimates. We note also that by including 
the Mg b spectral region, our claim of a massive dark 
matter halo is strengthened as the direction for the Mg b 
offset is towards lower velocity dispersions. We pick up 
this discussion in ^3.51 

3.2. Uncertainty Estimates 

Error estimates for the best-fit LOSVD for each spatial 
bin are determined in two ways. The first is made via 
Monte Carlo simulations while the second is an empirical 
method that makes use of the wide wavelength coverage 
of VIRUS-P. Then, for each velocity bin in each LOSVD, 
the largest of the uncertainties is taken as the uncertainty 
for that velocity bin. Both methods are described here. 

The first error estimate is made by a Monte Carlo boot- 
strap method for each of the 4 spectral regions used in 

1 Since the completion of the dynamical modeling, the contin- 
uum normalization issue experienced with the Ca H + K region has 
been solved. However, this region is not included in the dynamical 
models as the cost of re-running 1000's of models is prohibitive. 
We note Figure [8] where the Ca H + K region is included in the 
analysis of the systematic offset seen in the Mg b spectral region. 
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the final LOSVD. The best-fit convolved LOSVD and set 
of weighted template stars provide the starting point for 
100 Monte Carlo realizations. Each realization involves a 
randomly chosen flux value, drawn from a Gaussian dis- 
tribution, for each wavelength. The mean of the Gaus- 
sian distribution is the flux from the best-fit convolved 
template spectra, and the standard deviation is set as 
the mean of the pixel noise values for that spatial bin as 
determined in the Vaccine reductions. A new LOSVD 
is determined for all 100 realizations and provides a dis- 
tribution of values for all 29 velocity bins in the best-fit 
LOSVD. The error estimate is the standard deviation of 
the 100 realizations within each of the 29 velocity bins. 
This Monte Carlo simulation is run on all 4 spectral re- 
gions and returns 4 error estimates for each of the 29 
velocity bins in each of the 88 spatial bins. 

The second method for estimating the uncertainty is 
made by calculating the standard deviation of the 4 
LOSVDs within each of the 29 velocity bins. This error 
estimate, combined with the 4 from the Monte Carlo sim- 
ulations, gives us 5 estimates of the uncertainty within 
each of the 29 velocity bins of the LOSVD. The largest 
uncertainty at each of these steps is taken as the final 
uncertainty used in the dynamical modeling. We note 
that both the Monte Carlo and empirical method for de- 
termining the uncertainty return similar results, with the 
empirical method typically being larger. 

3.3. Stellar Template Library 

The template stars used in the extraction of the 
LOSVD come fro m the Indo-US spectral library 
(jValdes et all 12004). The 12 stars in our final template 
library (Table [3]) were chosen from an initial list of 40 
stars selected to cover a range in stellar type and metal- 
licity. These 12 stars were selected from the initial list as 
they returned the lowest residuals when fit to the spec- 
tra while still maintaining a good range in stellar type. 
As the resolution of the template stars does not match 
the instrumental resolution of VIRUS-P, we must con- 
volve the template stars with the instrumental resolu- 
tion of VIRUS-P. The instrumental resolution varies both 
between fibers and, as the instrument was refocused in 
April 2009, between observing runs. A further complica- 
tion is that spectra from several fibers are often combined 
to reach the desired S/N. For overlapping pointings this 
combination can involve spectra from opposite ends of 
the CCD where the instrumental resolution can be dif- 
ferent by as much as 0.7 A FWHM. For a galaxy like M87, 
with velocity dispersions around 300 km s _1 , the error 
introduced by ignoring this difference is small (~ 2%). 
A simple solution, particularly given M87's high velocity 
dispersion, would be to convolve all the spectra to the 
lowest instrumental resolution. However, as we are in- 
terested in developing data reduction methods to accept 
all of the galaxies in our sample, we avoid degrading our 
resolution to the lowest value in the following manner. 

The instrumental resolution is calculated from Gaus- 
sian fits to 8 unblended arc lines from the arc lamp cal- 
ibration frames taken each night. As the instrumental 
resolution values are noisy, particularly at weaker spec- 
tral lines, a small, smoothing boxcar (5 fibers wide) is 
run along the spatial direction. Measurements of the fo- 
cal ratio degradation of the VIRU S-P fibers show min i- 
mal fiber-to- fiber variation (< 2%) (|Murphv et alll2008l) . 



As focal ratio degradation is the dominant characteris- 
tic of an optical fiber impacting instrumental resolution, 
differences in the instrumental resolution across the spa- 
tial direction of the chip are due to optical effects after 
the light exits the fiber. As resolution changes stem- 
ming from optical effects should be continuous, a boxcar 
smoothing of the instrumental resolution values is justi- 
fied. Differences in the calculated instrumental resolu- 
tion from night to night over an observing run are ~ 1% 
and so one instrumental resolution map is made for an 
entire observing run. The worst instrumental resolution 
over our data set is 5.0 A FWHM at 4060 A and 4.4 A 
FWHM at 5673 A. Once an estimate of the instrumen- 
tal resolution for every fiber and for each observing run 
is made, the instrumental resolution for each fiber going 
into a spatial modeling bin receives a normalized weight 
based on the number of exposures going into the final 
spectra. This approach gives more weight to fibers that 
provide more weight to the final spectra while accounting 
for differences in instrumental resolution between fibers 
and observing runs. Due to M87's high velocity disper- 
sion, this step amounts to a negligible change in the final 
LOSVD. 

Initially, we explored using template stars taken with 
VIRUS-P to avoid the complications of convolving the 
template spectra with the instrumental resolution. The 
results achieved by this method proved less robust for two 
primary reasons. First, the S/N of the Indo-US spectra 
is very high. While it is certainly possible to reach this 
S/N with VIRUS-P, there are observing time costs to 
consider. As using template stars taken with the instru- 
ment is effective only if we are able to fully sample the 
instrumental resolution across the CCD, many exposures 
on the same template star are necessary. The second 
limitation is the variety of stellar types available during 
an observing run. Although some variety in stellar type 
and metallicity is achievable, significant observing time 
would be lost in attempting to build up a sufficiently 
diverse stellar library. 

3.4. Moments of the LOSVDs 

In Figure [7] we plot the first 4 Gauss- Hermite moments 
of the LOSVDs from each of our 88 spatial bins. The 
colored diamonds indicate the angular position on the 
galaxy, with black along the major axis followed by blue, 
green, orange and red falling along the minor axis. For 
visual clarity, error bars are plotted only for data along 
the major axis. The error bars along the other axes 
are of comparable size. The vertical dashed lines indi- 
cate where the SAURON kinematics are used over the 
VIRUS-P data in the dynamical modeling. Overplot- 
ted with open diamonds are moments from the best-fit 
logarithmic model at each spatial bin, after averaging 
over the angular bins. To minimize visual confusion, the 
model fits have not been plotted in the central region. 

3.5. Systematics in Stellar Kinematics 

We have found a systematic offset between our mea- 
surement of velocity dispersion when compared to the 
SAURON data set. The offset is localized around the 
Mg b lines. Figure [§] plots the velocity dispersion mea- 
sured for the combined VIRUS-P wavelength regions 
used in the dynamical modeling (red circles) and the ve- 
locity dispersion calculated from just the Mg b region 
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(green diamonds). Also plotted are the SAURON veloc- 
ity dispersions for M87 (black squares). The SAURON 
spectral range is 4810-5310 A and shows a similar off- 
set to the VIRUS-P Mg b spectral region. To highlight 
this difference we have plotted, in Figure the VIRUS- 
P spectra for 5 spectral regions, along with the template 
fits (red) and calculated velocity dispersion for each. For 
this particular spatial bin at R = 24.1" the velocity dis- 
persion determined from the Mg b region is lower than 
the mean of the other 4 regions by ~ 30 km s _1 . This 
offset is not an outlier as can be seen in Figure [9l To 
place a number on this offset we note that the average 
velocity dispersion of all the VIRUS-P data points be- 
tween 7" < R < 36" is 301.8 km s^ 1 when all 4 spectral 
regions used in the dynamical modeling are included as 
described in i)3.1l The average velocity dispersion when 
using just the VIRUS-P Mg b region over the same spa- 
tial range drops to 281.8 km s _1 . Over the same spatial 
region (7" to 36") the average SAURON velocity disper- 
sion is 287.0 km s _1 . 

The cause for this offset is unknown and we do not 
attempt a detailed analysis of the offset here. Consid- 
ering the good agreement between the SAURON and 
VIRUS-P results for the Mg b spectral range, and the 
different methods used by both data reduction pipelines 
to extract stellar kinematics, the offset is likely intrin- 
sic to this spectral region. The issues surrounding 
the Mg b spectral region for determination of the ve- 
locity dispersion of elliptical galaxies, and the correla- 
tions with both gal axy luminosity and velocity disper- 
sion a r e well known dTerlevich et al.lll98lHDressler et al.l 
19871 IWorthev et al.1 119921: IKuntschner et al.l 120011 ). 
Barth et al.l (|2002l ) compare the velocity dispersion val- 
ues measured from the Mg b and Ca triplet spectral re- 
gions for a sample of 33 local galaxies. They find that 
the Mg b region is more sensitive to changes in the fitting 
procedure than the Ca triplet region, and exhibits an off- 
set in velocity dispersion for 48% of the galaxies in their 
sample, yet with roughly equal numbers of galaxies show- 
ing higher velocity dispersion values from either one or 
the other spectral region. Barth et al. also compare the 
velocity dispersions of their Mg b region calculated when 
both including and excluding the 5150-5210 A spectral 
window. For 32 of their 33 galaxies they find lower ve- 
locity dispersion values when this region is suppressed 
from the fitting, with a clear trend towards a larger off- 
set with higher galaxy velocity dispersion. We have ex- 
plored this trend by suppressing a similar spectral region 
(5150-5220 A) from our fitting and find similar results; 
velocity dispersion values calculated from the VIRUS-P 
spectra where the Mg b lines are withheld from the fit- 
ting are systematically lower than when these lines are 
included. However, the magnitude of our offset is small 
(~ 3 km s _1 ) and is ~ 10% of the offset seen by Barth 
et al. This discrepancy in the magnitude of the absolute 
offset value is likely due to differences in the two kine- 
matic extraction routines used. Interestingly, it is in the 
opposite direction as naively expected from a comparison 
of the SAURON and VIRUS-P Mg b regions as exclud- 
ing the Mg b lines leads to lower velocity dispersions, not 
higher ones. This suggests that the driving force behind 
the overall offset between the Mg b spectral region and 
the other 4 VIRUS-P spectral regions is not driven pri- 



marily by fits to the Mg b lines, but rather springs from 
issue in fitting that spectral region as a whole. A sys- 
tematic study of various kinematic fitting methods over 
different spectral regions would be highly illuminating. 

4. DYNAMICAL MODELS 

We employ axisymmetric orbit-based d ynamical mod- 
eling b ased on the idea first presented in iSchwarzschildl 
(|1979() . The specific det ails of our axisymmetr i c mod - 
eling can be f ound i n IGebhardt et all (I2000T, 12003), 
iThomas et all (pool I2005D and iSiopis et al.1 (2009). 
The models have been shown accurate to ~ 15% 
for recovery o f the dark matter halo parameters 
(IThomas et al.l I2005D and stellar M/L (|Siopis et alj 
120091) . Several other groups have developed their 
own mod eling based on Schw a rzschi ld's o rbit-based 
metho d. iDressler fc Richstond (|1988f ) and iRix et afl 
(|1997h developed an orbit-based dynamical model- 
ing code for spherical sys t ems. Ivan der Marel et al.1 



( 19981 ). ICretton et all ([1993) . IGebhardt et al.1 (|2000l) and 
iVerolme fc de Zeeuwl (|2002|) gener a lized t o axisymmetric 
systems and Ivan den Bosch et al.1 (|2008l ) has developed 
a triaxial code. Now a number of groups have employed 
Schwarzschild' s orbit-based method for black hole mass 
determination (|Cretton et al . 1999; Verolme &: de Zeeuwl 
l207^ICaroellari et al.ll2002t IGebhardt fc Thomasll2009D . 
stellar orbital structure and dark ma t ter content 
(ICretton et all 120001 IGebhardt et all 120031 iCopin et alj 
20041 iKrainqvic et all 120051 iCappellari et all 120061 



Tho mas et al . 2007: lForestellfcG ebhardtJ l2010l) 



We give an outline of the modeling procedure here. 
First, the galaxy's surface brightness is deprojected into 
a three-dimensional luminosity density. An edge-on in- 
clination is assumed and so the deprojection is unique. 
Next, a trial gravitational potential is determined based 
on the three-dimensional light distribution and an ini- 
tial guess for the stellar M/L, central black hole mass, 
and the dark matter halo parameters. Our orbit library 
is the same as used in GT09. The galaxy models ex- 
tend to 2000" over 28 radial, 5 angular, and 15 velocity 
bins. The gravitational potential and force are calcu- 
lated on a grid that is 5 times finer than the grid used 
to compare to the data. On average, 25000 orbits are 
run in the trial gravitational potential. A superposition 
of these orbits is created that is both constrained by the 
light density profile and is a best match to the kinematic 
data. The superposition is accomplished by giving each 
orbit a weight as determined by maximizing the func- 
tion S = S - ax 2 - Here, S is an approximation to the 
Boltzmann entropy, % 2 is the sum of squared residuals 
between the model and data LOSV Ds (Eqs. 5 and 6) , 
and a is a smoothing parameter. See ISiopis et al.l (2009) 
for a detailed description of both the creation of the or- 
bit library and determination of the orbit weights. The 
steps above are then repeated for a different model, each 
with a different stellar M/L, dark halo circular velocity 
and scale radius. 

Three types of models are run. First, we ran a set 
of dynamical models with no dark matter halo. As the 
only free parameter is the stellar M/L, only 100 models 
are needed to fully explore the parameter space. For the 
cored logarithmic halo (Eq. 3) we ran 6500 dynamical 
models. Over 8500 models are run assuming an NFW 



The Dark Matter Halo of M87 



7 



dark halo profile (Eq 4). For each model a distinct set 
of orbital weights is used and takes approximately 1.5 
hours of cpu to run. We use the Lonestar computer at 
the Texas Advanced Computing Center (TACC) at The 
University of Texas, Austin to complete all our dynami- 
cal modeling. 

4.1. Model Assumptions 

We calculate three types of dynamical models, each 
assuming a different mass distribution. First, we consider 
a mass model for M87 with no dark matter halo. The 
mass distribution (p) for these models takes the form 



p(r) = Ti/(r) + M.5(r) 



(1) 



where T is the stellar M/L, v is the three-dimensional 
light density and M, is the black hole mass. As the black 
hole is better constrained from GT09 we set our black 
hole mass to 6.4 x 10 9 Mq for all our dynamical models. 
Gebhardt et al. (2010, submitted) has refined the black 
hole mass estimate of M87 to 6.6(±0.4) x 10 9 M , yet 
this small change is within our uncertainties and does 
not change our results. 

Both the second and third sets of dynamical models 
include a parameterization for a dark matter halo. The 
mass distribution then becomes a sum over each of the 
mass terms as follows 



p(r) = Tv(r) + M m 6(r) + pDu(r) 



(2) 



where the first two terms are the same as in equation 
1, and /Qdm (r) is the dark matter density term. Two 
different parameterizations for the dark matter halo are 
explored. The first is a logarithmic dark matter halo 
with a density profile as given by 



/0dm 0") cx v\ 



2rl 



( r 2 + r 2y 



(3) 



The logarithmic halo features a flat central density core 
of size r c and an asymptotically constant circular veloc- 
ity, v c . The second dark matter density pa rameterization 
is a N avarro-Frenk- White (NFW) profile (N avarro et al.l 
1996) as given by 



Pdm(t, 



(r/r s )(l + r/r s ) 2 



(4) 



The NFW halo diverges like r -1 towards the center and 
drops as r -3 with radius. The concentration (c), scale 
radius (r s ) and the virial radius (r v ) are related via c = 
r v /r s . Both dark matter param eterizations ar e inclu ded 
in the modeling as described in lThomas et al.l ([2005). 

4.2. Modeling the Stars and Globular Clusters 

The spatial grids used for the modeling are the same 
as in GT09. The spatial binning is split into N r = 28 ra- 
dial and Ng = 5 angular bins. Where the model bins are 
larger than the SAURON bins, we re-bin the SAURON 
data. The re-binning is accomplished by taking the 
average of all the LOSVDs falling within one model 
bin, weighted by their uncertainties. This complication 
doesn't arise with the VIRUS-P stellar data as we sim- 
ply combine all the spectra from all fibers that fall within 
a given model bin before extraction of the LOSVD. For 



the central model bins (R < 8") we elect to use just 
SAURON data for its superior spatial coverage. Between 
8" < R < 16" we use both SAURON and VIRUS-P 
data. We do not combine these data, but rather send in 
two LOSVDs independently into the dynamical modeling 
routines. A total of Ng ars = 25 + 80 LOSVDs (SAURON 
+ VIRUS-P) are used in the modeling, with each stellar 
LOSVD, £ stars , sampled by N vcl = 15 velocity bins. 

To determine the best-fit model, a x 2 minimization is 
run in each trial potential. The \ 2 is calculated as 




(5) 



Here, £f? odc » is the i th model LOSVD in the j th ve- 
locity bin. The orbit model is forced to reproduce u, 
the stellar density, to machine precision. The residuals 
between the model and actual set of 105 LOSVDs are 
minimized for a single model potential, yielding a single 
Xstars value - 

As the globular clusters can have a different orbital 
structure than the stars, they are treated as a separate 
kinematic component. The GCs are handled in a similar 
fashion as the stars, with the difference that we employ 
a deprojected number density for the GCs rather than 
the stellar luminosity density as for stars. Both the stars 
and GCs are then treated as massless test particles that 
orbit in a potential established by the assumed BH mass, 
stellar M/L and dark halo parameters. The weighted 
orbit superposition in each trial potential is determined 
by minimizing a similar equation as for the stars, namely, 



Xgc 




(6) 



where £ GC are the N^ c = 11 globular cluster LOSVDs 
built up from individual GC velocities as described in 
£13.11 As with the stellar density, v, the GC number 
density is reproduced to machine precision. 

4.3. x 2 Analysis 

A x 2 analysis is used to determine both the best-fit 
modeling parameters and their uncertainties. We can 
rule out a model with no dark matter with high con- 
fidence. The best-fit no dark matter model returns a 
stellar M/L v = 11.4. However, the x 2 minimum for this 
model is 4898, which is a Ax 2 > 3571 increase over ei- 
ther of the best-fit models including dark matter. We 
do not discuss these models further. The best-fit mod- 
els for both the logarithmic and NFW halos returns x 2 
minima of 1299.4 and 1310.1 respectively. The Ax 2 of 
10.7 between the two dark matter parameterizations is 
statistically significant when comparing the different con- 
straints we get on the stellar M/L. However, we do not 
get a constraint on either of the NFW dark halo parame- 
ters, concentration and scale radius. This is clearly seen 
in the lower, right panel of Figure [T0l where no clear x 2 
minimum for scale radius is seen out to 350 kpc. As our 
kinematic data does not extend beyond 50 kpc we should 
not expect to get a constraint much beyond this radial 
distance. As we do not constrain the NFW dark halo, 
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we focus here on the logarithmic halo results for our dis- 
cussion of the x 2 analysis, and refer to the NFW results 
where appropriate. 

To select the best-fit dynamical model we analyze the 
X 2 values returned from each model run. The x 2 values 
plotted in Figure \TU\ are the additive combination of the 
X 2 values of both stars and GCs, namely, x 2 — Xstars + 
Xqc ■ Figure [10] plots these x 2 values against the three 
model parameters for both the logarithmic and NFW 
dark halos. Each point gives the x 2 value from a single 
dynamical model. The logarithmic dark halo parameters 
are plotted on the left. The solid red line is a cubic spline 
fit to the lowest x 2 values along the parameter space. 
The dashed blue line shows the x 2 minima coming from 
just the stars. For plotting purposes, an additive shift of 
41.5 has been given to the dashed blue line. As the shift 
is additive, the relative x 2 values are preserved. 

On the right side in Figure [TO] we plot the x 2 values 
for the NFW models. We do not get a constraint on the 
NFW dark halo scale radius and concentration param- 
eter. This is evident in the lower-right panel of Figure 
[TO] where the x 2 minimum runs unconstrained to r s val- 
ues, well beyond the extent of our kinematic coverage. 
As the NFW concentration parameter is related to the 
scale radius as c = r v /r s , we also do not constrain this 
parameter. 

A total of 105 stellar LOSVDs and 11 GC LOSVDs 
are used in the dynamical modeling. Of the 105 stellar 
LOSVDs, 25 are determined from the 4 SAURON mo- 
ments which provides 25 x 4 = 100 parameters. The 80 
VIRUS-P LOSVDs used in the modeling are fit to 15 
velocity bins, giving 80 x 15 = 1200 more parameters. 
The 11 GC LOSVDs are constructed from 4 parameters, 
giving another 11 x 4 = 44 parameters which totals to 
1344 for each dynamical model. The best-fit dynamical 
model for a logarithmic halo had a x 2 = 1299.4, giving a 
reduced x 2 value of 0.97. The x 2 minimum for the NFW 
halo was 1310.1, which gives a similar reduced x 2 value. 

The constraints on stellar M/L come predominately 
from the stars, yet the GCs help to constrain the high 
M/L end as can be seen in the top- left panel of Figure 
[TUl This result is not surprising. The GC kinematics 
constrain the total enclosed mass in the outer model- 
ing bins; their kinematics strongly influence the resulting 
dark matter halo mass. As we assume a constant M/L for 
the stars, mass not accounted for in the dark matter halo 
must get accounted for in the stars and drive the M/L 
to higher values. Therefore, kinematics that constrain 
the dark halo will also constrain regions of the model- 
ing where that mass would otherwise wind up, namely 
higher values for the stellar M/L. 

In the lower-left panel of Figure [TO] we see a different 
influence of the GC kinematics stemming from their ex- 
tended spatial coverage. Constraints on higher r s values 
come from the GC kinematics, which extend to 47 kpc. 
This result is expected, as the stellar kinematics do not 
extend out to the dark halo scale radius and can therefore 
not influence the modeling. Clearly the GC kinematics 
are important for constraining the dark halo parameters, 
and the good agreement between the best-fit stars + GC 
model and the stars-only model, where the kinematics 
overlap, is reassuring since it implies that both large radii 
stars and globular clusters are in dynamical equilibrium. 



Further evidence for equilibrium between the large radii 
stars and GCs is seen in the excellent agreement in their 
velocity anisotropy (see i)5.3l and Figure IT5j) . 

The degree to which the GCs help to constrain the dark 
matter profile can be seen in another light in Figure [TO] 
where we plot the enclosed dark matter fraction for the 
logarithmic dark matter halo. The solid red line shows 
the dark matter fraction when including both GC and 
stellar kinematics in the analysis. The dashed blue line 
comes from an analysis of the stars only. It is clear that 
kinematics at large radii are essential to a robust deter- 
mination of the dark matter fraction at all radii beyond 
the central 0.3 R e . 

The Ax 2 = 1 range gives us the 68% confidence 
bands for each of the three parameters. For the loga- 
rithmic dark halo, the best- fit stellar M/L is 9.1±gJ ( v 
band). The best- fit dark matter halo circular velocity 
is 800^25 km s _1 , and dark matter halo scale radius is 
radius of 36^ kpc. The NFW dark halo, while not con- 
strained, still gives a robust estimate of the stellar M/L 
of 8.20to °Q. The difference in these stellar M/L values 
is driven entirely by the shape of the assumed dark halo, 
and that the dynamical models work by constraining to- 
tal enclosed mass. As the NFW halo allows for a higher 
central concentration of mass, and the stellar M/L is 
assumed constant as a function of radius, mass can be 
taken up by the cuspier NFW profile, thus lowering the 
M/L of these models. 

5. DISCUSSION 

The parameters of the dark halo from this paper are 
different than the ones presented in GT09 which is also 
based on a stellar dynamical analysis. GT09 also fit a 
cored logarithmic dark matter halo yet find a circular 
velocity that is 10% lower and a scale radius that is 60% 
lower than these results. The reason for the difference is 
due to the datasets; the data presented here have sub- 
stantially improved kinematic coverage for the stars. The 
stellar kinematics of GT09 end at 33" whereas our cov- 
erage extends to nearly 240". The GC data is identical 
between the two papers. The large gap in kinematic spa- 
tial coverage in GT09 between 33" and 140" leads to 
generally poor constraints on the dark matter halo pa- 
rameters. The new VIRUS-P data closes this gap and is 
therefore more robust. 

5.1. Enclosed Mass 

The best-fit dark matter halo parameters for a cored 
logarithmic profile returns 800i 2 g km s _1 for circular ve- 
locity, and 36^3 kpc for the scale radius. In terms of 
enclosed mass, M87's dark matter halo is one of the 
largest ever measured for an individual galaxy. Figure [T2l 
plots enclosed mass for our best-fit logarithmic and NFW 
models. The black and red lines, with uncertainty, plot 
total enclosed mass for the logarithmic and NFW mod- 
els respectively. The inclusion of a 6.4 x 10 9 M black 
hole keeps the total enclosed mass from reaching zero at 
R = 0. The uncertainties are the min/max values for the 
4 2 = 16 dynamical models that explore the parameter 
limits of our 68% confidence bands. For the uncertainty 
in the black hole mass we use the ±0.5 x 10 9 M Q values 
from GT09. The stellar component is plotted in green 
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(dot-dash) with uncertainties within the thickness of the 
lines. The dark matter profiles are plotted in gray (long 
dash). The yellow vertical line shows the extent of our 
kinematic data. The comparison between the enclosed 
mass model from the best-fit logarithmic and NFW ha- 
los shows good agreement to the end of our stellar kine- 
matics. The NFW enclosed mass profile then begins to 
diverge to lower total enclosed mass to the end of our 
kinematic coverage. We discuss how our results compare 
to other mass estimates for M87 in the following section. 

5.2. Comparison to Other Mass Estimates 

At larger radii iDoherty et al.1 (|2009( ) have measured 
kinematics of PNe for M87. They find a dark halo con- 
sistent with the one presented here inside of 500", al- 
though since their radial range is 400" to 2500" there 
is not much spatial overlap with our current data set. 
At around 600" they find that the mass density begins 
to decrease strongly, leading to a truncation of M87's 
dark halo. At R = 1500", their outermost radial bin, 
the PNe dispersion they measure is 78 ± 25 km s . For 
the spatial overlap between our work and theirs (400" 
to 540"), where we are now comparing globular clusters 
and PNe, the kinematics disagree. Possible reasons for 
the disagreement are that the GC kinematics in this re- 
gion are poorly measured or that the GCs are not in dy- 
nami cal equi l ibrium (e.g . from a recent m erger event). 
Both ICohenl (pOOOl) and ICote et all (|200l ) find the GC 
population around M87 shows both chemical and kine- 
matic evidence for two distinct populations of GCs. An- 
other possibili ty is that the PNe mea surements are biased 
in some way. IDoherty et all (pool ) exclude 3 of 8 PNe 
for their R = 800" bin as intracluster planetary nebula 
and not tracing the potential of M87. Including these 3 
PNe raises their measured dispersion from 139 km s _1 
to 247 km s _1 . Certainly a comparison to cither GC 
or stellar kinematics at this radial position would be en- 
li ghtening. 

IWu fc Tremaind (|2006[ ) estimate the enclosed mass 
of M87 at 32 kpc (35.1 kpc at our assumed distance) 
to be 2.4(±0.6) x 10 12 M using GC kinematics and 
assuming spherical symmetry. Our mass estimate of 
3 - 64 ±a65 x 1012 M © (logarithmic halo) at this radial 
position falls wit hin the uncertainties, yet with a n off- 
set of ~ 34%. IRomanowskv fc Kochane"k| (120011) . us- 
ing stellar kinematics fr om Ivan der Marell (|1994D and 
ISembach fe Tonrvl (|1996f ). and GC kinematics from sev- 
eral sources, derive an enclosed mass profile for M87 that 
shows a similar offset towards lower total mass over the 
range 1 R e < R < 5 R e . Within 1 R e their models diverge 
to ~ 50% lower total mass. This discrepancy may be due 
to the stellar kinematics used over this radial range. The 
stellar kinematics of Sembach & Tonry exhibit a system- 
atic offset from other data sets for which Romanowsky 
& Kochanek make a correction. The offset between our 
enclosed mass and theirs within 1 R e may be due to this 
effect or due to the different modeling assumptions, as 
Romanowsky et al. assume spherical symmetry for their 
modeling. The discrepancy may also come about due to 
the increase in spatial coverage the VIRUS-P data affords 
over their long-slit spectroscopy. 

Compar is on of the X-ray mass determination from 
iDas et all (|2010|) to our mass profile from stars 



and GCs shows good agreement over the range 
4 kpc < R < 20 kpc, yet diverges elsewhere (see Figure 
IT4|) . At both larger and smaller radii the mass profile 
from X-rays is lower than that determined by the stars 
and GCs. At R = 3 kpc the X-ray estimated mass is down 
by 50 % and at R = 2 kpc the disagreement is ~ 70%. A 
similar discrepancy is seen at larger radii. The enclosed 
mass from X-rays at R = 47 kpc, our furthest data point, 
is lower by 50% than our best-fit value . This difference is 
similar to the one seen in NGC4649 (IShen fc Gebhardti 
I2010T ) . One possible explanation for this discrepancy 
comes from allowing for a turbulent component in the 
X-ray gas. A 50% decrease in enclosed mass can be ex- 
plained by a ^30% non-gravitational component in the 
gas. This amount of of difference is similar t o the the- 
oretical expectation of IBrighenti fc Mathewsl (12001D an d 
has been seen in similar systems ( jChurazov et alj feOlO). 
More analysis on a wider set of galaxies is necessary to 
fully understand the source of these differences. 

In Tableland Figure [14] we compare enclosed mass es- 
timates from the literature to this work. Our logarithmic 
and NFW halo mass profiles are plotted as in Figure Q21 
Each colored symbol in Figure Q3] indicates the methods 
employed to determine the enclosed mass. In general, we 
find a more massive dark halo for both our logarithmic 
and NFW parameterizations, although our enclosed mass 
values at various radial positions are not consistently the 
highest reported in the literature and appear consistent 
with the scatter of the data seen in Figure Q3J 

5.3. Stellar Anisotropy 

The mechanisms by which mass accumulation occurs in 
galaxies l eave their mark on the distribution fu nction of 
the stars (|Lvnden-Belllll967t IValluri et alj|2007l) . There- 
fore, mapping the anisotropy of both the stars and GCs 
can address questions surrounding galaxy formation his- 
tory and evolution. Our orbit-based dynamical modeling 
return the stellar orbital structure, which we summarize 
in Figure [T5l Plotted is the average velocity anisotropy 
over the 20 angular modeling bins of both the stars and 
GCs. The uncertainties are calculated in the same way as 
described in Figure [T2l and the text. Within R ~ 0.5 R e 
the stars show radial anisotropy, then become mildly tan- 
gentially anisotropic to the last stellar data point. The 
excellent agreement between the stars and GCs in this re- 
gion is indicative of dynamical equilibrium between these 
two components. Although we do not conduct a detailed 
analysis of the anisotropy of M87 here, comparison of 
anisotropy maps to N-body simulations can be highly in- 
fo rmative. An example of such an analysis can be found 
in Hoffma n et al.l (120101) where the dynam ical modeling 
of NGC4365 by Ivan den Bosch et al. (2008) is compared 
N-body simulations. 

5.4. A Discussion of Systematic Uncertainties 

Given the high S/N of our data, we pay particular 
attention to quantifying systematic uncertainties, since 
they might be important for the reported uncertainties. 
As we are using A\ 2 to determine the parameter values 
and uncertainties, if we do not have proper uncertainty 
estimates for the kinematics we will bias our final mod- 
eling results. There are three internal consistency checks 
that demonstrate that our uncertainties are properly es- 
timated. 
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First, we estimate LOSVDs and Gauss- Hermite pa- 
rameters from 4 different wavelength regions. Comparing 
the standard deviation across the four regions to the in- 
dividual uncertainties from the Monte Carlo simulations 
provides a consistency check. We find that, in general, 
these two uncertainties estimates are consistent. The 
large wavelength range of VIRUS-P provides this very 
important estimate, which includes both statistical and 
systematic effects. 

Second, the reduced y 2 for the best-fit dynamical mod- 
els is near unity. The x 1S measured from the LOSVDs, 
and we can see the agreement in the plot of observed and 
modeled moments (Figure [7]) . The deviations between 
the data and the modeling moments are consistent with 
the stated uncertainties. While this consistency does not 
directly show that systematic effects are not an issue, it 
is an indirect confirmation. 

Third, when comparing kinematics from datasets taken 
at different times we find consistent results within the 
stated uncertainties. With the spatial overlap of our 
pointings #3 and #4 we are able to compare the re- 
sulting kinematics from four of our spatial bins when 
taken a year apart. We have compared the first four 
Gauss-Hermite moments, calculated from our extracted 
LOSVD, and find that they are all consistent within their 
stated uncertainties. These three internal checks demon- 
strate control of the measured uncertainties. 

Next, we discuss the two areas where systematic ef- 
fects may be important: sky subtraction and template 
mismatch. In order to determine how the level of sky sub- 
traction affects our extracted kinematics and subsequent 
modeling, we explore both over and under subtraction of 
each 20 minute science frame. A range of sky subtrac- 
tion levels are created and taken through all subsequent 
data reductions. A total of 25 different sky subtractions 
are made on each science frame, over a range of ±12.5% 
when compared to equal exposure times. The details of 
these reductions are given in the Sky Subtraction sec- 
tion in the Appendix. We then compare the calculated 
velocity and velocity dispersion values, taken from the 
best-fit LOSVD. This comparison, over all 88 spectra, 
shows no systematic offsets in velocity or velocity dis- 
persion for either over or under subtraction of the night 
sky. The associated random errors for this full range 
of sky subtractions is within our quoted uncertainty for 
both velocity and velocity dispersion. 

In order to explore possible systematics due to our 
use of the Indo-US spectral library, we select the same 
set of template stars from th e Miles spectral library 
(|Sanchez-Blazquez et al.l |2006[ ) and extract kinematics 
for all of our spectra. The two libraries agree very well, 
with deviations between the libraries of ~ 2.5km s _1 , 
well within our quoted uncertainties for velocity disper- 
sion. In the case of velocity, there is a slight offset 
(~ 7km s _1 ) which is due to the lack of a velocity zero- 
point between the two libraries. Both of these checks 
indicate that our systematics are under control. 

5.5. Next Steps 

This work points the way to several other areas of in- 
quiry. We have explored two different parameterizations 
for a dark matter halo, yet others exist and there is no 
reason to dismiss any of them. A natural next step is 
to rigorously explore several different dark matter halo 



parameterizations with the same data sets and model- 
ing methods to determine which, if any, is favored. This 
requires us to push the collection of stellar kinematics 
to ever larger radii. The amount of observation time 
needed to reach to 2.4 R e with VIRUS-P was not sub- 
stantial, and stellar kinematics to 3 and 4 R e are achiev- 
able. These data would allow for both better constraints 
on the various dark matter halo parameters and a com- 
parison with the other dynamical tracers (i.e. GCs and 
PNe) . As much of our current understanding of the dark 
matter halos around elliptical galaxies depends on GC 
and PNe kinematics, a robust comparison between each 
tracer is needed to explore systematics. 

A second avenue of exploration comes from the 
information contained in the stellar chemical abun- 
dances available t hroug h a Lick index analysis. 
iGraves &: Schiavonl (|2008l ) provide a publicly available 
tool that is well-suited for this work. How elliptical 
galaxies formed and whether their stars were formed in 
situ or accreted ov er time requires both a dynamical and 
chemical analysis (|Graves fc Faberl [2010). The chemi- 
cal composition of GCs at larg e radii have been studied 
(|Cohenll2000t iCote et alJl200l . and a detailed compar- 
ison of both the kinematics and chemistry of both GCs 
and stars at the same radial position should prove im- 
mensely fruitful. 

Finally, work towards a more complete and uniform 
sample of massive elliptical galaxies, both 1 st and 2 nd 
rank galaxies, and equally massive field ellipticals (e.g. 
NGC1600) is needed to explore the influence of environ- 
ment on dark matter halos. Several groups have made 
significant progress towards this end, yet the data sets 
that involve both 2D spatial coverage at both small and 
large need to be expanded. 
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Fig. 1. — An image of VIRUS-P data from a 20 minute central pointing on M87 (#3 in FigureUJl after all preliminary data reduction is 
complete. Just the central ~ 90 fibers are shown. The inset shows a close-up of 11 fibers extracted over a 5 pixel- wide aperture. Residuals 
from the 5577 A sky line subtraction can be seen to the far right. The strong absorption feature seen on the far left is the G-band (~ 4310 A 
rest frame). The weaker absorption band near the center, just to the left of the small box, is H« (~ 4860 A) and the strong, wide feature 
towards the right is the Mg b region (~ 5167 to 5183 A). 

APPENDIX 
DATA REDUCTION 

The reduction of integ ral field spectroscopy ( IFS) data involves numerous issues not faced in the reduction of 
traditional long-slit data dBarden fe Wadd fjJ88t iParrv fc Carrascdll990l : iWvse fc Gilmorelll992t iBarden et all 119931 : 
iLissandrini et al1 IT994; Wat son et a JI1998T). Each fiber exhibits its own character, with variations in spa tial PSF, trans- 
mission and focal ratio degradation ( Avilalll988tlRamseyll988l : [Bershadv et al.ll2004tlMurphv et al.ll2008| ). Despite these 
complications, many group s have developed robust and versatile pipelines for th e reduction of IFS data ()Bacon et all 
120011: iZanichelli etaLll2005t iTurner et afll2006t lSanchedl2005 iSandin et ai1l2010D . 

This paper is the first in a series and establishes our principle methods of data reduction. For this reason we give a 
detailed description of each step in the data reduction process. In the description below, the term "spectral" is used in 
indicate the wavelength or X-direction on the CCD. The term "spatial" is used for the cross-dispersion or Y-direction. 



Reduction Details 

The preliminary data reduction uses Vaccine (jAdams eTaD[2Qi3), an in-house pipeline developed for reduction of 
VIRUS-P data. We give a full account of the Vaccine data reduction steps here. First, the overscan and bias are 
subtracted from all frames. The CCD is very clean, therefore no masking of bad pixels is conducted. Next, the 
twilight flats and arc lamp frames are combined with the biweight estimator (| Beers "etaTlfl990h . The biweight is used 
at several steps in the reduction process. 

Due to issues of instrumental alignment and the inherent limitations of all optical elements, curvature in both the 
spatial and spectral directions on the CCD is unavoidable. The curvature along the spatial direction is handled by 
allowing each fiber to have its own wavelength solution. In order to correct for the curvature along the spectral 
direction the twilight flats are used to locate the centers of each fiber and determine the fiber trace. To accomplish 
this, a 21 pixel-wide boxcar is run along a single fiber of the twilight flat in the spectral direction. The fiber profile 
in the spatial direction is super-sampled with the boxcar and fit with a Gaussian profile to determine the center of 
the fiber at each pixel step in the spectral direction. This boxcar method effectively smooths over fiber spatial profile 
variation due to solar absorption features and pixel-to-pixel flat-field variation while giving a robust estimate of the 
location of the center of the Gaussian profile. As the curvature in the spectral direction is not extreme (~ A5 pixels 
from the center of the CCD to the edge of the 2048 x 2048 chip) a 21 pixel-wide boxcar smoothing is justified. The 
location of the centers of all the Gaussian profiles for a single fiber are then fit with a 4th order polynomial. The 
polynomial fit becomes the trace of the fiber. The steps described above are repeated for all 246 fibers. 
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Fig. 2. — Spectra from three of the 88 spatial bins located at R = 97.1", 189.2" and 238.0". The counts are a biweight combination of 
the CCD counts in ADU, after sky subtraction, over 20 fibers, 72 fibers and 38 fibers respectively. The typical level of the night sky is 
shown for comparison. The night sky subtracted from the most distant pointing at R = 238" is ~ 5 times brighter than the galaxy. 

Once the fiber trace is determined for all 246 fibers, the fiber profile is extracted, fiber by fiber, over a 5 pixel-wide 
aperture. Figure [T] shows an image of the central ~ 90 fibers of a typical frame after extraction. As the fiber centroid 
moves from one row of pixels to the next the 5 pixel extraction aperture follows. By allowing the extraction aperture 
to make discrete steps between rows of pixels we avoid interpolation of the data. There are two advantages to not 
resampling the data at this step. First, we avoid introducing the correlated noise inherent to interpolation and can 
therefore carry accurate pixel-to-pixel noise calculations through the final step of the Vaccine reductions. This is 
helpful as a proper S/N calculation is necessary for the Monte Carlo error estimations made later in the reductions 
(i j3.2p . Second, interpolation can artificially broaden the spectra, and while the dispersion of M87 is well above the 
instrumental dispersion for all our pointings, this should not be assumed a priori. 

The typical FWHM of a fiber profile along the spatial direction is ~ 4 pixels with an average spacing of ~ 8 pixels 
between the centers of adjacent fibers. We have measured the cross-talk between fibers to be < 1% over a 5 pixel- 
wide aperture. The fiber position on-sky is mapped onto the CCD from left-to-right and top-to-bottom (Figure [5]). 
Therefore, neighboring fibers on the CCD are typically neighboring fibers on the sky and, as neighboring fibers are 
often combined to reach the desired S/N, the effect of cross-talk is further mitigated. We explored extracting over a 7 
pixel- wide aperture and compared the final S/N of both extractions. Due to the low level of signal in the edges of the 
7 pixel aperture, the 5 pixel aperture returns better S/N and is used for all VIRUS-P data presented here. 

Mercury and cadmium arc lamp frames are used for wavelength calibration and afford 8 unblended and well-spaced 
emission lines over our wavelength range. The wavelength of each emission line has been confirmed using the Robert 
G. Tull Coude spectrograph on the 2.7 m telescope in the R = 60k set up. The wavelength solution for each fiber is 
determined as follows. For an individual fiber, each emission line is fit with a Gaussian profile to determine its center. 2 



2 We have characterized the spectral PSF of the VIRUS-P instrument and find it to be very nearly Gaussian. To quantify this, we fit 
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A 4 th order polynomial is fit to the centers of each emission line and the residuals between the actual wavelength and 
fit wavelength are minimized. The cadmium 3611.3 A line and the mercury 5769.6 A line are near the blue and red 
edges of our wavelength range and minimize the amount of extrapolation needed at the edges of the polynomial fit. 
Typical rms residuals of the polynomial fit are < 0.07 A or < 4.4 km s" 1 at 4800 A (FWHM) . Comparison of the 
wavelength solution from the arc lamps taken at the start and end of the night show differences well below the noise 
of the fit. We find that small linear shifts in the spectral direction of the fiber, possibly due to thermal variations at 
the output end of the fibers, can occur on the timescales of an hour. To account for this shift, a correction is made to 
the th order term of the wavelength solution based on the change in position of the bright 5577.34 A sky line. The 
average of this correction over all frames was 0.13 A with a standard deviation of 0.11 A. A heliocentric correction 
is made to each frame. For our February data this correction had a mean of 19.3 km s~ x and a standard deviation 
of 0.7 km s^ 1 . For our January data the mean heliocentric correction is 27.6 km s^ 1 with a standard deviation of 
0.2 km s _1 . 

The next reduction step involves creating a flat-field frame from the twilight flats. There are four different pieces 
of information combined in the twilight flats: pixel-to-pixel variation, fiber-to-fiber throughput variation, fiber cross- 
dispersion profile shape, and the twilight sky spectrum. The first three are aspects of the flat-field we want to preserve 
while the twilight sky spectrum must be removed. Our approach is to construct a model of the twilight sky, free of 
flat-field effects, and then divide this mo del out of the original twilight frame. To generate a model of the night sky 
we use a method similar to IKelsonl (|2003|) for IFS sky subtraction. We outline our method here. 

To model the twilight sky a 51 fiber- wide boxcar is run along th e spatial direction. As each fiber has a slightly 
different wavelength solution, a B-spline interpolation (jDie rckx 1991) of the pixel's wavelength is made, based on the 
wavelength solution determined by the arc lamp polynomial fits. By employing a B-spline interpolation, we are not 
limited by the pixelization of the wavelength solution. The 51 fiber-wide boxcar and 5 pixel- wide extraction aperture 
provides 51 x 5 = 255 estimates of the twilight flux at a given wavelength. Both pixel-to-pixel, fiber throughput, and 
fiber profile shape vary on scales much smaller than the size of the boxcar and are thus smoothed out. What remains 
is a model of the twilight sky with flat-fielding effects removed. The model is then divided out of the original twilight 
flat, leaving pixel-to-pixel, fiber throughput, and fiber profile shape intact. We attempted to avoid these complications 
through the use of dome flats, yet the intensity of the available dome lamps below ~ 4000 A is too low to determine 
an accurate fiber trace. Even if an acceptable light source was available, there is another issue with dome flats. It 
has been shown that the input acceptance angle is preserved through optical fibers and that focal ratio degradation is 
dependent on this angle (|Carrasco &: Parry||1994l : iMurphv et al.ll2008l ). As light entering the fibers from a dome lamp 
is not collimated, there is a concern that the fiber cross-dispersion profile is not being properly quantified with the use 
of dome flats. Twilight flats avoid both of these issues. 

Initially, the wavelength solution is estimated from un-flattened arc frames. This can lead to errors in the wavelength 
solution when an arc line falls on top of a flat-field feature. Therefore, the determination of the wavelength solution 
and subsequent derivation of the flat-field frame is iterative. The arc lamp frames are flattened, and the wavelength 
solution is recalculated. As the flat-fielding procedure relies on the wavelength solution, a new flat-field, based on the 
new wavelength solution, is also made. We find this iteration leaves the wavelength solution for most fibers unaffected, 
yet can improve the residuals by ~ 0.05 A for a handful of fibers where one or more of the arc lines used for the 
wavelength solution fall on strong flat-field features. A single iteration is all that is required. 

The flat-field frame captures the pixel-to-pixel, fiber-to-fiber and fiber profile shape variation for each fiber at very 
high S/N. However, due to thermal effects over an observing night, the science and sky frames can exhibit a shift in 
the position of the fiber profile. This shift manifests as a breathing mode and can reach up to a 0.3 pixel shift in the 
center of a fiber when the temperature gradient over the night is steep. A shift in the center of a fiber will lead to large 
flat-fielding errors if not accounted for. To correct for this effect we have developed a heuristic solution. The general 
idea is to measure the offset over a subset of fibers, then create a unique flat for each science and sky frame based on 
the master flat for that night. We will refer to this frame as the science flat and it is generated as follows. For each 
fiber in each science frame the difference between the fiber center of the master flat and science frame is calculated at 
all 2048 pixel positions. The median of these values for each fiber is taken, then smoothed with a 12 fiber-wide boxcar. 
As the breathing mode is smooth and continuous, and the signal in the science and sky frames can be quite low, this 
level of smoothing is both required and justified. 

The fiber profiles have shapes that deviate slightly from any simple parameterization. For all resamplings we employ 
a sine interpolation, chosen for its non-parametric properties. Simply resampling each fiber's flat will properly capture 
the shift in fiber position, but will improperly capture the pixel-to-pixel features. Therefore we run a 81 pixel-wide 
boxcar along the dispersion direction to isolate the fiber profile from the pixel-to-pixel variation. The original flat 
containing the proper pixel-to-pixel map and the resampled fiber profile are combined to form the final science flat. 
As sine interpolation is not flux-conserving, the science flat is renormalized to match the total counts in the original 
science frame. Once this unique flat is applied, we are left with science and sky frames that have been extracted, 
wavelength calibrated and flattened. The next step is sky subtraction. 

Gauss-Hermite coefficients to 4 bright lines in our mercury-cadmium arc lamp frames. Over the 4 spectral lines and all 246 fibers the 
median H3 coefficient is 0.003 ± 0.013, while the median H4 coefficient is 0.0003 ± 0.0117. 
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Fig. 3. — Top: The final VIRUS-P spectra for a spatial bin at R = 60". The black spectra is the biweight combination of 6 fibers 
over 12 exposures. Overplotted in green are the 12 sky spectra subtracted from each exposure prior to the biweight combination. The 
continuum has been normalized here for plotting purposes. Middle Top: The variance of the 12 sky spectra shown in the top panel. Note 
that while certain high variance regions are associated with night sky lines (e.g. 5200 A), others are not associated with any strong night 
feature (e.g. 4860 A). These variance plots are calculated for each spatial bin and used to help determine locations where the sky evolves 
on short time scales. Lower 2 Panels: Shown in black are the final spectra for two spatial bins at R = 13.2" (upper) and R = 222.0" 
(lower). Overplotted in red is the best- fit stellar template spectra. The gray areas in both lower figures indicate spectral regions that are 
suppressed when completing the kinematic extraction due to either issues with sky subtraction or template mismatch. We note that while 
these spectral regions are not fit when extracting the final stellar kinematics, the difference in the final LOSVD when the smaller regions 
(AA < 50 A) are not masked is minimal. Table [2] shows how the spectra are split into 5 spectral regions prior to stellar template fitting. 



Sky Subtraction 

Figure [2] plots M87 spectra from 3 different spatial bins. An estimate of a typical night sky spectra is shown for 
comparison. The spectra are shown in CCD counts and the relative flux between the spectra has been preserved. At 
1 R e the galaxy is still brighter than the night sky by about a factor of 2. By 2 R e the night sky is now more than 
twice as bright as the galaxy, and in our furthest spatial bin the sky is ~ 5 times brighter. A careful handling of sky 
subtraction at these low surface brightnesses is important, and we discuss our method in detail here. VIRUS-P does 
not have sky fibers and so sky nods are necessary. Lacking sky fibers has obvious drawbacks as we sample the sky at a 
different point in time, and loose science observing time to sky nods. Despite these disadvantages there are benefits to 
sky nods. One clear advantage to sky nods comes from the much improved noise statistics we get from sampling the 
sky with all 246 fibers. As our sky subtraction is done with a model of the night sky (described below) the addition 
of noise from the night sky estimate is reduced by y/N, where N is the number of fibers. In contras t, Densepak and 
Sparsepak on the WIYN te lescope dedicate 4.4% and 8.5% of their CCD to sky fibers, respectively, (jBarden fc Wadel 
1988; Bcrshadv ct al. 2004) while the SAURON instrument dedicates just over 10% of its lenslets to estimates of the 
sky (|Bacon et alj 12001). Sky nods also avoid the risk of cross-talk between the science and sky fibers, particularly 
when observing bright science targets. 

A more serious issue with dedicated sky fibers is their limited offsets from the center of the science portion of the 
integral field unit. The sky fibers for Densepak and Sparsepak are offset 60" and 70" from the center of the science 
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field. For SAURON this is increased to 154" yet this size of offset can prov e constraining for near by galaxy work. For 
a galaxy like M87, whose half-light radius is ~ 100" and possibly larger ()Kormendv et al.l l2009f ). the dedicated sky 
fibers are still collecting a significant amount of light from the galaxy itself, thus reducing the final S/N. It is clear 
that for M87, estimates of the sky at ~ 150" from the galaxy center will lead to subtraction of some level of galaxy 
light. How strong this effect is depends critically on how far out from the galaxy center the sky estimates come from. 

With exposure times of 5 minutes for the sky nods and 20 minutes for the science frames, weights are given to the 
sky frames. The weighted sky frames are then summed to produce an estimate of the night sky. If the night sky did 
not evolve over 30 minutes, then the best sky estimate would come from weighting each neighboring sky nod by 2 and 
summing them. However, the sky can evolve on time scales less than 30 minutes. To account for this, 25 different sky 
estimates are made, each created by giving different weights to the neighboring sky nods. These frames are then sent 
through all of the reductions independently. Five different weights are used for each sky nod, ranging between 1.75 to 
2.25 in 0.125 increments. With each sky nod receiving five different weights, the various combinations of sky nods lead 
to the 5 2 = 25 estimates of the night sky. Once the remaining data reduction is complete we have 25 versions of each 
science frame. This range allows us to analyze, in a very direct way, both the best sky to subtract from each science 
frame, and the influence of our sky subtraction on the final stellar kinematics. We describe here how the individual 
sky frames are created from the sky nods and then discuss how the best sky to subtract from each science frame is 
selected from the 25 options. 

For each of the 25 scaled sky frames bright continuum objects and cosmic rays are identified as 3<7 outliers above 
the median and masked. As a 3cr cut may not catch low level sources, a 51 fiber-wide boxcar is run over the frame. A 
boxcar of this size corresponds to smoothing over a 107" x 21" region of the sky, so even faint, extended sources are 
removed. From this frame the sky is modeled by the same method used to model the twilight sky during the creation 
of the flat-field. The principle difference is that rather than modeling the sky to divide out of the frame, the sky model 
is what we are after. 

The first step in determining the best level of sky to subtract is a visual inspection of the quality of subtraction of 
the night sky lines. However, the determination can not be made based solely on these lines as they evolve on very 
short time scales and independently of the thermal background that most strongly effects our estimates of the stellar 
kinematics. The second step is to conduct a preliminary fit of a single template star (HD20893) to the data. We 
outline the steps here, leaving the details of this fitting method to §3.11 First, for each of the 25 data frames, a set of 
fibers seeing a moderate level of galaxy light are selected and combined to form a single spectrum. The exact number 
of fibers and amount of galaxy light isn't critical as the final determination comes from a relative comparison of the 
results. In fitting the data with a template star, a convolution occurs between the template star and an assumed 
line-of-sight velocity dispersion profile (LOSVD), which accounts for broadening in the spectra due to the temperature 
of the galaxy at that location. Normally, a continuum offset for the template star is allowed to float when conducting 
the extraction of kinematics. However, for this step, this value is fixed to avoid possible degeneracies between this 
parameter and the level of sky subtraction. A comparison of the residuals of the fit between each of the 25 frames 
and the single template star is used to determine the best level of sky subtraction. For nearly all frames (~ 90%) the 
best sky subtraction comes from scaling each sky nod by 2 and summing them. The exception to this occurs primarily 
with exposures near dawn or dusk when twilight begins to affect the weighting. 

As the sky evolves on time scales shorter than 30 minutes, the accuracy of our sky subtraction is not perfect and 
certain spectral regions remain problematic, particularly around regions of bright sky emission lines. In order to get 
a handle on both the location and severity of these issues, we conduct a visual inspection of the data by overplotting 
the sky-subtracted galaxy spectra for a given spatial bin with each subtracted sky that goes into a given pointing. We 
show an example of the results of this inspection in Figure [3J In the top panel, the 12 different sky spectra subtracted 
from each 20 minute frame are overplotted on the resulting galaxy spectra. At each pixel the variance is calculated 
for the 12 sky spectra and plotted in the second panel. A region of high variance indicates a region where the night 
sky evolves substantially between exposures. Notice that the high variance tends to be, but is not limited to, spectral 
regions with bright sky lines. The lower two panels in Figure [3] show the final sky subtracted spectra, along with the 
best-fit stellar template, for two spatial bins at R = 13.2" and 222.0". The details of this fitting routine are given in 
£13.11 The gray areas indicate spectral regions excluded from the stellar template fitting. These regions are excluded 
when the template fit to the data is poor due to either issues with sky subtraction or template mismatch. 

With 25 different estimates of the level of sky subtraction we are in a good position to explore systematics due to 
either over or under subtracting the night sky. To do this we select a range of spatial bins at various radii and S/N. 
We then take the reductions up through extraction of the LOSVD and compare the moments of the LOSVDs from 
the 25 frames. Although variation between these frames is seen, it is both random and within the uncertainties of 
our analysis. This is particularly true for central regions of the galaxy where our S/N is high. At larger radii, where 
the galaxy light is faint, over or under subtraction of the night sky tends to wash out the signal entirely rather than 
introduce systematics. 

There is another component of the reductions that further mitigates error due to inaccurate sky subtraction. The 
final spectra is, at minimum, a combination of three separate frames, and up to 15 for the case of pointing #1. As the 
sky subtraction from each 20 minute exposure is independent, random poor sky subtraction is mitigated by having 
many frames. This mitigating effect gains strength for the more distant pointings, where the number of exposures 
increases. 
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Further Reductions 

Once the sky subtraction is complete, cosmic rays are located and masked. To locate cosmic rays, each pixel value 
is compared to the pixel values that fall along either the same row or column of the extracted frame. Comparison with 
pixel values along the same row avoids masking continuum sources while comparison along the same column avoids 
masking real galaxy emission features that will appear in neighboring fibers. A pixel found to be a la outlier in this 
comparison is masked, as well as all neighboring pixels. Any low level cosmic rays not masked in this step are rejected 
when the spectra from different exposures and fibers is combined. 

Next, fibers containing either foreground or background objects are located and masked. These fibers are identified 
by taking the median of the flux in each fiber and plotting these values against the position on the galaxy. As the 
median is taken over 5 rows x 2048 pixels = 10,240 values any residual cosmic rays or emission features will not 
influence this map. Foreground stars and background objects are located as outliers from the smooth continuum of 
the galaxy and masked. Although these objects will fall onto the same fibers for the same pointing, each science frame 
is inspected individually. We find this frame-by-frame check is necessary as transient objects, most notably satellites, 
can swamp an entire row of fibers. 

Low level background sources remain a concern for the most distant pointings where the surface brightness of 
M87 may approach the level of these sources. As the final spectra for the most distant bins comes from a biweight 
combination of between 30 to 70 fibers, and even a large background source will fall into just a few of our large fibers, 
any low level source will be rejected by the final biweight combination. 

Each fiber for each night has a unique wavelength solution. Therefore, a linear interpolation is required before 
combining spectra from different fibers. The fiber cross-dispersion profile shape in each science frame is removed 
via division by the flat, and so the spectra from each of the 5 rows of a fiber is used in the biweight combination 
independently. In the case of pointing #3, where we have 3 science exposures, a minimum of 15 estimates of the 
spectra go into the biweight ( 1 fiber x 3 e x posur es x 5 rows). The biweight estimator has been shown to be robust for 
samples smaller than 15 fSee lBeers et al.l ()1990[ ) for references). For our largest spatial bin, 72 fibers x 15 exposures 
x 5 rows = 5,400 estimates are sent into the biweight. Once this step is complete we are left with the 88 VIRUS-P 
spectra presented in this work. 



The Dark Matter Halo of M87 



19 




Fig. 4. — An SDSS image of M87 showing the positions of the 5 VIRUS-P pointings. Each 107" X 107" box consists of a hexagonal array 
of 246 optical fibers (see Figure [5j. The total exposure time for each pointing is given in Table 1. North is up and East is to the right. 



TABLE 1 
Exposure Times for M87 Pointings 



Exposure 


Observation 


Rmin Rmax 


Pointing Time (min) 


Date 


(") (") 



180 
120 
100 
60 
120 
240 



01- 08 

02- 08 

01- 08 

02- 09 
02-08 
02-08 



130.0 
130.0 

45.0 
0.0 

43.0 
127.0 



238.0 
238.0 
140.0 
73.0 
136.0 
203.0 



Note. — The exposure times, date of observation, and 
radial positions of the 5 VIRUS-P pointings on M87. The 
exposure times quoted are the total science exposures in- 
cluded in the final VIRUS-P data. Ten of the 51 expo- 
sures taken were withheld from the reductions based on 
analysis of the S/N of the resulting spectra. Sky nod ex- 
posure time is not included in these totals. All observing 
conditions were good to photometric, with typical seeing 
values of 1.5". These data were all taken within ±3 days 
of the new moon. 



20 Murphy, Gebhardt & Adams 



O O O O O O 

oooooooo 
)ooooooo 

300000000000 

oooooooooooooo 

ooooooooooooooo 

oooooooooooooo 

ooooooooooooooo 

oooooooooooooo 
ooooooooooooooo 

oooooooooooooo 
ooooooooooooooo 

oooooooooooooo 
ooooooooooooooo 

oooooooooooooo 
ooooooooooooooo 

oooooooooooooo 



Fig. 5. — The relative positions of the 246 fibers comprising pointing #4. The fibers are aligned in a hexagonal array with a one-third 
fill factor. Each fiber has a 4.1" on-sky diameter. 




Fig. 6. — Two LOSVDs from spatial bins at R = 24.1" (left) and R = 173.2" (right), plotted with their uncertainties. Overplotted with 
lighter colored lines are the LOSVDs from the four wavelength regions used to determine the final LOSVD. Seventy-nine of the 80 final 
modeling LOSVDs are constructed from 4 of the LOSVDs determined from the 4 spectral regions shown in Table [2] For one spatial bin 
the iron region (5300-5850 A) proved a poor fit and was withheld from the final LOSVD. 
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TABLE 2 
VIRUS-P Spectral Regions 



Wavelength Range (A) Spectral Features 



3650-4050 
4195-4585 
4455-4945 
4930-5545 
5300-5850 



Ca H, Ca K 
G-band 
H-bcta 
Mgl b 
Iron 



Note. — The 5 spectral regions chosen for 
extraction of the best-fit LOSVD for each spa- 
tial bin. The calcium H &; K region (3650- 
4050 A) is not used in the determination of 
the final LOSVD due to a systematic offset in 
the measured velocity dispersion as compared 
to the other spectral regions. This systematic 
is likely due to issues with the continuum nor- 
malization over the bl ue re gion of the spectra 
(see Footnote [T] and £|3. 51 for further discus- 
sion). 



TABLE 3 
Indo-US Template Stars 



ID Type V [Fe/H] 



HD 


50420 


A7III 


6 


.16 


0. 


.30 


HD 


78362 


F5III 


1 


.65 





.52 


HD 


5015 


F8V 


4. 


.82 


0. 


.00 


HD 


693 


F5V 


1 


.89 


-0 


.38 


HD 


39833 


G0III 


7 


.66 





.01 


HD 


161797 


G5IV 


3 


.41 


0. 


.16 


HD 


199960 


G1V 


6 


.21 





.11 


HD 


17820 


G5V 


8. 


.38 


-0 


.69 


HD 


20893 


K3III 


5. 


.09 


-0. 


.13 


HD 


6734 


K0IV 


6 


.46 


-0 


.25 


HD 


92588 


K1IV 


6 


.26 


-0 


.10 


HD 


130025 


K0V 


6 


.16 


-0 


.19 



Note. — The template stars used 
in the determination of the best-fit 
LOSVD. These 12 stars were selected 
from an initial list of 40 stars based 
on a minimization of the fitting resid- 
uals during the kinematic extraction. 
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Fig. 7. — The first 4 moments of the Gauss-Hermite expansion of the 88 VIRUS-P LOSVDs. The filled diamonds show the data at 
different angular bins. The black diamonds are for the major axis, followed by blue, green, orange, and with red along the minor axis. The 
dashed vertical lines near the center indicate the region where VIRUS-P data is not used and SAURON kinematics are employed in the 
modeling. The open diamonds, connected by a line, plot the moments, averaged over the angular bins, from the best-fit logarithmic dark 
halo model. 
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Fig. 8. — VIRUS-P data for 5 spectral regions from the spatial bin at R = 24.1". The black line plots the d ata and the red line plots the 
best-fit stellar template. The shaded gray regions are withheld from the kinematic extraction as discussed in £13. II The velocity dispersions 
measured for each of the 5 spectral regions are shown in the upper-left of each panel. The systematic offset between the Mg b region 
(cr = 278.7 km s — 1 ) and the other 4 spectral regions (with a mean of a = 308.5 km s — 1 ) is clear. The Ca H + K region, while initially 
withheld from the dynamical modeling due to a large systematic offset seen in its calculated Gauss-Hermite moments, is included here. 
The offset seen in Ca H + K was due to issues of continuum normalization. Since completion of the dynamical modeling this issue was 
resolved and can now be included in this comparison. 
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Fig. 9. — Velocity (top) and velocity dispersion (bottom) measurements from SAURON and VIRUS-P. Black squares plot the SAURON 
data and red circles the VIRUS-P data. The green diamonds show the velocity dispersion measured with VIRUS-P over the Mg b region. 
These velocity dispersion values are offset by ~ 20 km s~ 1 from the velocity dispersion values calculated when combining spectral regions. 
With a wavelength range covering the H-beta and Mg b spectral regions, the SAURON spectral range is similar to the VIRUS-P Mg b 
region. The agreement between the SAURON and VIRUS-P velocity dispersion values over this region (277.0 km s _1 and 281.8 km s 
respectively) is within our uncertainties. 
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Fig. 10. — The \ 2 values (stars + GCs) vs. the three modeling parameters for a logarithmic dark matter halo (left) and NFW halo 
(right). Each black dot is the x 2 value from a single model. To highlight the variation at the \ 2 minimum, only a few hundred of the 1000s 
of models run are shown. A smoothed spline fit to the minimum \ 2 values (plotted in red) gives us our 68% (Ax 2 < 1) and 95% (Ax 2 < 2) 
confidence bands. The dashed blue line plots the x 2 minimum values for the stars. An additive shift of 41.5 has been made to the stellar 
values. As the shift is additive, the relative x 2 values are preserved. The NFW halo results show the lack of constraint on the NFW scale 
radius parameter (lower right panel). We do not show similar spline fits to the NFW halo due to the unconstrained nature of the model. 
We note that while we do not constrain the NFW dark halo parameters, the constraint on the stellar M/L is very robust. The AM/L of 
1.1 between the logarithmic and NFW models is due in large part to the difference in inner slope of the assumed dark halo parameters. 
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Fig. 11. — Plots of the x 2 mi nim ums of the 3 parameters plotted against one another for the logarithmic dark matter halo. The x 2 range 
shown is the same as in Figure ITOl (left half). The small black dots show individual models that lie near the x 2 minimum. The larger black 
dots show models that fall within the 95% confidence band (Ax 2 < 2) while the larger red dots show models within the 68% confidence 
band (Ax 2 < 1). The modeling degeneracy between the dark and luminous matter, as discussed in GT09, is clearly seen in the correlation 
between the stellar M/L and the two dark matter halo parameters. 



TABLE 4 
Literature Mass Comparison 



Reference 


Method 


Symbol 


Radius 


Literature 


Logarithmic 


NFW 


(year) 






(arcsec) 


1O 12 M 


10 12 M Q 


10 12 M o 


Neito & Monnet (1984) 


Empirical 


circle 


490 


5.20 


4 90+ 1 - 14 
4 - MU -0.82 


n Q +139 


Brandt & Roosen (1969) 


Empirical 


diamond 


84 


2 7+ 1A 
z - ' -1.4 


4 (17+0.28 
4 - u ' -0.25 


4 cjn+0.49 
*-3U_ 34 


Poveda (1961) 


Empirical 


square 


84 


i.4lg:S 


42+0' 03 
u - 4z -0.03 


44+ 05 
u ' 44 -0.03 


Fabricant et al. (1983) 


X-rays 


circle 


1336 




17.9+j ^ 


18.3+1'i 


Huchra & Brodic (1987) 


GC kinematics 


circle 


248 


6.1+ 22 


1 61+ ' 35 


1 47+0.37 
-0.26 


Mould et al. (1987) 


GC kinematics 


diamond 


200 


o.9o±g*:« 


, ir-+0.22 
^ — 0.18 


-, 19 +0.25 
1 ' iz -0.17 


Sargent et al. (1978) 


Stellar kinematics 


diamond 


80 


1Q+ 010 
u ' iM -0.20 

14+ - 05 


n oq+0 03 
iJ.jy_o.o2 


41+0.05 
u - 41 -0.03 


Sargent et al. (1978) 


Stellar kinematics 


diamond 


17 


24+ 01 

u - Z4 -o.oi 


24+ 02 
U ' Z4 -0.01 


Tsai (1996) 


X-rays 


diamond 


266 


2.20 




1 61 + 042 
1 - D1 -0.29 


Merritt & Tremblay (1993) 


GC kinematics 


square 


603 




c cc+1-50 




Matsushita et al. (2002) 


X-rays 


square 


113 


0.43+1;° 


= t-o+0.58 
o.oo_ 50 


o,MZ -0.60 


Matsushita et al. (2002) 


X-rays 


square 


226 


i.oli:8 


1 QO+0.29 
l._9_ 24 


1 S0+ ' 32 
l.-U_ 2i 


Matsushita et al. (2002) 


X-rays 


square 


340 


2 0+ 10 
z - u -1.9 


9 71 +0.65 
z -' L -0.50 


2 26+ ' 69 
z - ZD -0.46 


Wu & Trcmaine (2006) 


GC kinematics 


circle 


406 


2 4+O.6 
Z ' 4 -0.6 


3 fi 4+0 87 
•3-0^-0.65 


2 qq+0 97 
z ' yJ -0.65 



Note. — Enclosed mass values from the literature. CI) Ref erence. C2) Method employed to determine 
enclosed mass. C3) Symbol used to plot the data in Figure [l4l C4) Radial distance from the center of the 
galaxy, scaled to the distance assumed in this work (R = 17.9 Mpc). C5) Literature value of enclosed mass 
at the radial position in C4. The uncertainty is quoted, where available. C6) Enclosed mass from the best-fit 
logarithmic halo model from this work. C7) Enclosed mass from the best-fit NFW halo model from this work. 
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Fig. 12. — Total enclosed mass as a function of radius. The solid, black line indicates total enclosed mass for our best-fit logarithmic 
model. The NFW enclosed mass profile is plotted in red. Both of these enclosed mass models are plotted with uncertainties, which are the 
min/max values for the 4 2 = 16 dynamical models that explore the parameter limits of our 68% confidence bands. The green lines plot 
stellar mass for both models (with uncertainties less than the thickness of the line) and the light gray lines, with uncertainties, indicate the 
two assumed dark matter distributions. Note the total enclosed mass does not go to zero with radius due to inclusion of a 6.4 X 10 9 Mq 
mass black hole. Modeling results beyond our last data point, indicated by the vertical yellow line, are not constrained by the data, and 
are therefore suspect. 




Fig. 13. — The enclosed dark matter fraction as a function of radius for a logarithmic halo. The red (solid) line shows the best-fit x 2 
model for both stars and globular clusters (i.e. x 2 = Xstars + Xoc)' blue (dashed) line shows the best- fit dynamical model based 

on the x 2 value for stars only (i.e. x 2 = Xstars)- This figure indicates the degree to which the large radii GCs influence the dark matter 
fraction at all radii. 
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Fig. 14. — A comparison of total enclosed mass from the literature. The symbols are explained in Table [4] The color of the symbols 
indicates the method employed to make the mass determination . B lue: empirical, Green: GC kinematics, Red: Stellar kinematics, Orange: 
X-rays. The red, black and yellow lines are described in Figure [T2l 
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Fig. 15. — The ratio of the radial velocity anisotropy to the tangential anisotropy for both the stars (red lines) and GC (blue dots). 



